Reserve metric exporter name prefix #361

Merged Aug 16, 2023 (1 commit)

@acrmp (Member) commented Aug 15, 2023

Description

  • Raise an error when rendering the OTel Collector config if the component identifier of a metric exporter contains '/cf-internal'.
  • This reserves the name prefix for potential later use.

Sample deployment time failure:

Task 83 | 02:33:18 | Error: Unable to render instance groups for deployment. Errors are:
  - Unable to render jobs for instance group 'windows2019-cell'. Errors are:
    - Unable to render templates for job 'otel-collector-windows'. Errors are:
      - Error filling in template 'config.yml.erb' (line 3: Metric exporters cannot be defined under cf-internal namespace)
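
For illustration, the check described above might look roughly like the following Ruby sketch. This is not the code from this PR: the helper name `validate_metric_exporters!`, the constant `RESERVED_MARKER`, and the assumption that exporters arrive as a hash keyed by component identifier are all hypothetical. Only the '/cf-internal' substring and the error message come from the description and the failure output.

```ruby
# Hypothetical sketch; the real check lives in config.yml.erb and may differ.
# Assumption: exporters are given as a hash keyed by component identifier,
# e.g. { 'otlp' => {...}, 'otlp/some-name' => {...} }.
RESERVED_MARKER = '/cf-internal'

def validate_metric_exporters!(exporters)
  (exporters || {}).each_key do |id|
    # Reject any identifier that contains the reserved marker.
    if id.to_s.include?(RESERVED_MARKER)
      raise 'Metric exporters cannot be defined under cf-internal namespace'
    end
  end
  exporters
end
```

Under these assumptions, a call with keys such as `'otlp'` or `'prometheus/dashboard'` returns the hash unchanged, while a key such as `'otlp/cf-internal-foo'` (an invented identifier) raises, which BOSH would surface as a template rendering failure like the one shown above.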

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

Testing performed?

  • Unit tests
  • Integration tests
  • Acceptance tests

Checklist:

  • This PR is being made against the main branch, or relevant version branch
  • I have made corresponding changes to the documentation
  • I have added testing for my changes

If you have any questions, or want to get attention for a PR or issue, please reach out on the #logging-and-metrics channel in the Cloud Foundry Slack.

@acrmp acrmp requested a review from a team as a code owner August 15, 2023 02:46

@chombium (Contributor) left a comment

LGTM!

It would be great to have a separate, reserved prefix for "platform internal metrics" (something like platform component metrics). We could then filter and process metrics based on a predefined namespace, or even forward metrics to a platform monitoring stack based on the namespace.

@acrmp (Member, Author) commented Aug 16, 2023

@chombium Thanks for the feedback. That sounds like a slightly different feature request. In this PR we're really just trying to ensure that we can reserve the ability to add exporters without risk of conflicting with other exporters defined by the operator.

What you're suggesting sounds interesting to consider for the future though.

@acrmp acrmp merged commit b1c1005 into main Aug 16, 2023
5 checks passed
@acrmp acrmp deleted the reserve-metric-exporter-name-prefix branch August 16, 2023 19:09

@chombium (Contributor) commented

@acrmp Thanks for the clarification. I totally agree with what you have written.

I always hated that application container metrics and application logs are mixed together with platform component metrics. In the Firehose architecture, a noisy application can cause other log envelopes, including platform metrics, to be dropped. The aggregate syslog drains fix the noisy neighbor problem, but consumers still get a mix of all metrics and have to do the filtering themselves.

Therefore, with the OTel Collector, I'm thinking of separating platform component metrics into their own pipeline, possibly adding some custom processors to adjust the metrics' content, and sending only platform component metrics to the platform monitoring system, whatever that is.

Another option is on-the-fly inspection of the metrics with a custom processor, sending alerts (via an alerting system) immediately and as close to the source of the metrics as possible.

OTel will open another level of possibilities for monitoring CF. I'm really excited about the improvements which we can bring to the logging and metrics stack and the observability of CF. :)
